AITopics

Country:

Asia > China > Sichuan Province > Chengdu (0.05)
Asia > China > Shaanxi Province > Xi'an (0.05)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Transportation (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Neural Information Processing Systems, Feb-16-2026, 19:28:31 GMT

96bbdd0ed2a9e7cd2fb7caf2fae15f3d-Paper-Conference.pdf

algorithm, artificial intelligence, machine learning, (14 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (0.67)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Neural Information Processing Systems, Feb-11-2026, 07:22:32 GMT

4786c0d1b9687a841bc579b0b8b01b8e-Supplemental-Datasets_and_Benchmarks.pdf

dataset, synthetic dataset, trajectory, (11 more...)

Country:

Asia > China > Shaanxi Province > Xi'an (0.08)
Asia > China > Sichuan Province > Chengdu (0.06)

Industry: Transportation (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Neural Information Processing Systems, Feb-10-2026, 08:16:04 GMT

where the last equality is simply a rearrangement of terms.

We wish to optimize the likelihood of the sequence conditioned on the start and the goal frame, $p(o_{2:T-1} \mid o_{1,T})$.

artificial intelligence, machine learning, sequence, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-9-2026, 17:26:56 GMT

876f1f9954de0aa402d91bb988d12cd4-Supplemental.pdf

backward process, diffflow, equation, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)

Bhole, Ajinkya, Filabadi, Mohammad Mahmoudi, Crevecoeur, Guillaume, Lefebvre, Tom

Unifying Entropy Regularization in Optimal Control: From and Back to Classical Objectives via Iterated Soft Policies and Path Integral Solutions

arXiv.org Artificial Intelligence, Dec-10-2025

This paper develops a unified perspective on several stochastic optimal control formulations through the lens of Kullback-Leibler regularization. We propose a central problem that separates the KL penalties on policies and transitions, assigning them independent weights, thereby generalizing the standard trajectory-level KL-regularization commonly used in probabilistic and KL-regularized control. This generalized formulation acts as a generative structure from which various control problems can be recovered. These include the classical Stochastic Optimal Control (SOC), Risk-Sensitive Optimal Control (RSOC), and their policy-based KL-regularized counterparts. We refer to the latter as soft-policy SOC and RSOC; they yield alternative problems with tractable solutions. Beyond serving as regularized variants, we show that these soft-policy formulations majorize the original SOC and RSOC problems. This means that the regularized solution can be iterated to retrieve the original solution. Furthermore, we identify a structurally synchronized case of the risk-seeking soft-policy RSOC formulation, wherein the policy and transition KL-regularization weights coincide. Remarkably, this specific setting gives rise to several powerful properties, such as a linear Bellman equation, a path integral solution, and compositionality, thereby extending these computationally favourable properties to a broad class of control problems.
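To make the "linear Bellman equation" property concrete, here is a minimal sketch in the classical linearly solvable MDP setting, which is an assumption for illustration and not necessarily the paper's exact formulation. With the desirability z(x) = exp(-v(x)), the Bellman equation for a first-exit problem becomes linear: z(x) = exp(-q(x)) * sum over x' of p(x'|x) z(x'). The function names, the passive dynamics `passive`, and the state costs `state_cost` are all hypothetical.

```python
import math

def solve_desirability(passive, state_cost, terminal, iters=500):
    """Fixed-point iteration on the linear Bellman equation
    z(x) = exp(-q(x)) * sum_x' p(x'|x) z(x'), with z fixed at terminal states.
    Assumed setting: a first-exit, linearly solvable MDP."""
    n = len(passive)
    z = [1.0] * n
    for s in terminal:
        z[s] = math.exp(-state_cost[s])
    for _ in range(iters):
        new_z = list(z)
        for s in range(n):
            if s in terminal:
                continue
            expected = sum(p * z[t] for t, p in enumerate(passive[s]))
            new_z[s] = math.exp(-state_cost[s]) * expected
        z = new_z
    return z

def optimal_transitions(passive, z, s):
    """Optimal controlled transitions are the passive ones reweighted by z."""
    unnorm = [p * zt for p, zt in zip(passive[s], z)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical 3-state chain: state 2 is the goal (terminal, zero cost).
passive = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0],
]
state_cost = [1.0, 1.0, 0.0]
z = solve_desirability(passive, state_cost, terminal={2})
u_from_1 = optimal_transitions(passive, z, 1)
```

In this toy chain the reweighted transitions out of state 1 put more mass on the goal than the passive dynamics do, because the goal has the highest desirability.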

artificial intelligence, formulation, machine learning, (15 more...)

2512.06109

Country: Europe > Belgium (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Control Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)

Venkatesh, Nishanth, Malikopoulos, Andreas A.

Model-Based Reinforcement Learning Under Confounding

arXiv.org Artificial Intelligence, Dec-9-2025

We investigate model-based reinforcement learning in contextual Markov decision processes (C-MDPs) in which the context is unobserved and induces confounding in the offline dataset. In such settings, conventional model-learning methods are fundamentally inconsistent, as the transition and reward mechanisms generated under a behavioral policy do not correspond to the interventional quantities required for evaluating a state-based policy. To address this issue, we adapt a proximal off-policy evaluation approach that identifies the confounded reward expectation using only observable state-action-reward trajectories under mild invertibility conditions on proxy variables. When combined with a behavior-averaged transition model, this construction yields a surrogate MDP whose Bellman operator is well defined and consistent for state-based policies, and which integrates seamlessly with the maximum causal entropy (MaxCausalEnt) model-learning framework. The proposed formulation enables principled model learning and planning in confounded environments where contextual information is unobserved, unavailable, or impractical to collect.

c-mdp, machine learning, reinforcement learning, (15 more...)

2512.07528

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

arXiv.org Artificial Intelligence, Dec-1-2025

CAPE: Context-Aware Diffusion Policy Via Proximal Mode Expansion for Collision Avoidance

Yang, Rui Heng, Zhao, Xuan, Brunswic, Leo Maxime, Alban, Montgomery, Clemente, Mateo, Cao, Tongtong, Jin, Jun, Rasouli, Amir

In robotics, diffusion models can capture multi-modal trajectories from demonstrations, making them a transformative approach in imitation learning. However, achieving optimal performance following this regimen requires a large-scale dataset, which is costly to obtain, especially for challenging tasks such as collision avoidance. In those tasks, generalization at test time demands coverage of many obstacle types and their spatial configurations, which is impractical to acquire purely via data. To remedy this problem, we propose Context-Aware diffusion policy via Proximal mode Expansion (CAPE), a framework that expands trajectory distribution modes with a context-aware prior and guidance at inference via a novel prior-seeded iterative guided refinement procedure. The framework generates an initial trajectory plan and executes a short prefix trajectory; the remaining trajectory segment is then perturbed to an intermediate noise level, forming a trajectory prior. Such a prior is context-aware and preserves task intent. Repeating the process with context-aware guided denoising iteratively expands mode support, allowing smoother, less collision-prone trajectories to be found. For collision avoidance, CAPE expands trajectory distribution modes with collision-aware context, enabling the sampling of collision-free trajectories in previously unseen environments while maintaining goal consistency. We evaluate CAPE on diverse manipulation tasks in cluttered, unseen simulated and real-world settings and show up to 26% and 80% higher success rates, respectively, compared to SOTA methods, demonstrating better generalization to unseen environments.
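The step of perturbing the remaining trajectory segment to an intermediate noise level can be sketched as follows. This assumes a DDPM-style variance-preserving forward process with a linear beta schedule; the abstract does not state the schedule, so that choice, the 1-D waypoint representation, and every name below are hypothetical. The guided denoising that would follow is not shown.

```python
import math
import random

def alpha_bar(k, num_steps=100, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_i) over the first k diffusion steps,
    under an assumed linear beta schedule."""
    prod = 1.0
    for i in range(k):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def make_trajectory_prior(plan, prefix_len, noise_level, rng, num_steps=100):
    """Split a plan into an executed prefix and a noised remainder.
    The remainder is pushed to diffusion step `noise_level` via
    x_k = sqrt(alpha_bar_k) * x_0 + sqrt(1 - alpha_bar_k) * eps."""
    executed = plan[:prefix_len]
    remaining = plan[prefix_len:]
    a = alpha_bar(noise_level, num_steps)
    noisy = [
        math.sqrt(a) * w + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for w in remaining
    ]
    return executed, noisy

# Hypothetical 1-D plan of six waypoints; execute two, noise the rest.
plan = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
executed, prior = make_trajectory_prior(
    plan, prefix_len=2, noise_level=30, rng=random.Random(0)
)
```

An intermediate `noise_level` keeps part of the original plan's signal (alpha_bar stays well above zero), which is one way to read the claim that the prior preserves task intent.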

artificial intelligence, machine learning, trajectory, (15 more...)

2511.22773

Country: North America > Canada (0.28)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Transportation (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.92)

Neural Information Processing Systems, Nov-20-2025, 19:07:32 GMT

Adaptive Path-Integral Autoencoders: Representation Learning and Planning for Dynamical Systems

Jung-Su Ha, Young-Jin Park, Hyeok-Joo Chae, Soon-Seo Park, Han-Lim Choi

Such learning problems are formulated as latent or generative model learning, assuming that observations emerged from low-dimensional latent states; this involves an intractable posterior inference of the latent states for given input data.

artificial intelligence, deep learning, machine learning, (16 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Cao, Yukang, Moorthy, Rahul, Poyrazoglu, O. Goktug, Isler, Volkan

C-Free-Uniform: A Map-Conditioned Trajectory Sampler for Model Predictive Path Integral Control

arXiv.org Artificial Intelligence, Oct-21-2025

Trajectory sampling is a key component of sampling-based control mechanisms. Trajectory samplers rely on control input samplers, which generate control inputs u from a distribution p(u | x) where x is the current state. We introduce the notion of Free Configuration Space Uniformity (C-Free-Uniform for short) which has two key features: (i) it generates a control input distribution so as to uniformly sample the free configuration space, and (ii) in contrast to previously introduced trajectory sampling mechanisms where the distribution p(u | x) is independent of the environment, C-Free-Uniform is explicitly conditioned on the current local map. Next, we integrate this sampler into a new Model Predictive Path Integral (MPPI) Controller, CFU-MPPI. Experiments show that CFU-MPPI outperforms existing methods in terms of success rate in challenging navigation tasks in cluttered polygonal environments while requiring a much smaller sampling budget.
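For orientation, the role of the control input sampler p(u | x) in an MPPI-style update can be sketched as below. This is a generic illustration, not CFU-MPPI: the sampler here is an environment-independent Gaussian (the baseline the abstract contrasts against), the dynamics are a hypothetical 1-D single integrator, and all names and parameter values are assumptions. A map-conditioned sampler would replace `gaussian_sampler`.

```python
import math
import random

def mppi_step(x, goal, sample_control, n_samples=1000, horizon=10,
              lam=10.0, dt=0.1, rng=None):
    """One sampling-based control update for the toy dynamics x' = x + u*dt.
    Rollouts are scored by squared distance to the goal and combined
    with softmin (exponentiated negative cost) weights."""
    rng = rng or random.Random(0)
    first_controls, costs = [], []
    for _ in range(n_samples):
        xi, cost, u0 = x, 0.0, None
        for t in range(horizon):
            u = sample_control(xi, rng)  # draw u ~ p(u | x)
            if t == 0:
                u0 = u
            xi = xi + u * dt
            cost += (xi - goal) ** 2
        first_controls.append(u0)
        costs.append(cost)
    # Subtract the minimum cost before exponentiating for numerical stability.
    c_min = min(costs)
    weights = [math.exp(-(c - c_min) / lam) for c in costs]
    total = sum(weights)
    weights = [w / total for w in weights]
    u_star = sum(w * u for w, u in zip(weights, first_controls))
    return u_star, weights

def gaussian_sampler(x, rng):
    """Environment-independent sampler: ignores both the state and any map."""
    return rng.gauss(0.0, 1.0)

u_star, weights = mppi_step(0.0, 1.0, gaussian_sampler)
```

Because the sampler is passed in as a function of the state, swapping in a sampler that also consults a local map changes only that one argument.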

artificial intelligence, machine learning, trajectory, (16 more...)

2510.16905

Country:

North America > United States > Texas (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)